BITS Meetings' Virtual Library

Abstracts from Italian Bioinformatics Meetings from 1999 to 2013

766 abstracts overall from 11 distinct proceedings


1. Avogadri R, Valentini G, Bertoni A
An integrated algorithmic procedure for the assessment and discovery of clusters in DNA microarray data
Meeting: BITS 2009 - Year: 2009
Topic: Novel methods and algorithms

Abstract: Missing

2. Bertoni A, Folgieri R, Ruffino F, Valentini G
Assessment of clusters reliability for high dimensional genomic data
Meeting: BITS 2005 - Year: 2005
Topic: Computer algorithms and applications

Abstract: Discovering new subclasses of pathologies and expression signatures related to specific phenotypes is a challenging problem in gene expression data analysis. To pursue these objectives, we need to estimate the “natural” number of clusters and the stability of the clusters discovered. To this end, new approaches based on random subspaces and bootstrap methods have recently been proposed.

3. Cesa-Bianchi N, Valentini G
Genome-wide hierarchical classification of gene function
Meeting: BITS 2009 - Year: 2009
Topic: Novel methods and algorithms

Abstract: Missing

4. Folgieri R, Bertoni A, Valentini G
An experimental comparison of Random Subspace and Random Projection Ensembles of Support Vector Machines for the classification of gene expression data.
Meeting: BITS 2007 - Year: 2007
Topic: Large scale analysis of experimental data

Abstract: Missing

5. Muselli M, Ruffino F, Valentini G
An Artificial Model for Validating Gene Selection Methods
Meeting: BITS 2004 - Year: 2004
Topic: Unspecified

Abstract: Every DNA microarray experiment provides thousands of real values that correspond to the gene expression levels of a tissue. This technology can offer a valuable new tool for medical diagnosis, since it can yield a reliable way to determine the state of a patient (e.g. healthy or ill) by measuring the gene expression levels of their cells. The dataset obtained through several microarray experiments can be represented by a table with m rows and n columns: each row is associated with an examined tissue and each column corresponds to one of the considered genes. To specify a particular state for each tissue, a final column must be added to the table. Typically m ~ 100, while n ~ 10000.
When analyzing this table to retrieve a model for diagnosis, we have two different targets: besides finding a method that recognizes the state pertaining to a specific tissue (discrimination), we wish to determine the genes involved in this prediction (gene selection). The quality of the discrimination task can be estimated through a measure of accuracy, obtained by proper methods (hold-out, cross validation, etc.). By contrast, it is very difficult to evaluate the results of the gene selection process, since the genes really involved in the onset of a state are unknown. A possible way of validating gene selection would be to analyze the performance of the considered method on a diagnosis problem where the significant genes are known. Unfortunately, at present no problem of this kind is available.
An alternative approach consists in building an artificial model, starting from proper biological motivations, that generates data having the same statistical characteristics as gene expression levels produced by microarray experiments. As proposed in [1], the behavior of a biological system can be described through regulatory networks that represent the interaction between different genes.
The nodes and the edges of these networks are ruled by dynamic equations that involve the concentration of products encoded by genes and consequently the gene expression levels. Each concentration is expressed through a real variable that changes with time and can determine the transition of the system from one state to another. When the organism is in a particular state, some concentrations are lower than a given threshold (specific for each gene), while others exceed a proper value. Thus, once a definite state is selected, a gene is said to be active if its expression level is consistent with that state, i.e. lower or greater than its specific threshold, as that state requires. With this definition each gene can be described by a binary variable, assuming value 1 if the gene is active and 0 otherwise. The presence of the considered state can likewise be expressed through a Boolean variable, which takes the value 1 if the tissue is in that state and 0 otherwise. Consequently, the whole biological system can be described by a Boolean function f with n inputs.
Each of the m available microarray experiments corresponds to a particular entry of the truth table for the function f; it is formed by an input-output pair (x,y), where x is a vector of n binary values associated with the examined genes and y is a binary value indicating whether the corresponding tissue is in the considered state. According to this setting, a technique to generate artificial data for validating gene selection methods consists in building a proper Boolean function f, whose truth table entries share the same statistical characteristics as gene expression levels produced by microarray experiments. Then, the quality of the gene selection method is measured by the percentage of significant genes retrieved. Although each Boolean function can be described by a logical expression containing only AND, OR and NOT operations, in our case it is more convenient to obtain f in a different way.
In fact, it can be observed that in biological systems genes can be assembled into groups of expression signatures, i.e. subsets of coordinately expressed genes related to specific biological functions. These groups of genes are, in some sense, equivalent with respect to the state determination. Thus, the Boolean function f can be viewed as a combination of several groups of genes. Each group is considered active if a sufficiently large number of its genes are active. Then, the function f assumes value 1 if the number of active groups exceeds a given threshold.
A proper algorithm for constructing Boolean functions with these characteristics has been implemented. It is able to generate data resembling those produced by several microarray experiments for diagnostic purposes. In these cases two or more different states are analyzed and the algorithm constructs a specific Boolean function (adopting the above approach) for each state. Then, to allow the application of the gene selection method, a set of input-output pairs is produced for each Boolean function built. The algorithm includes several parameters that can be tuned to achieve a good agreement between the resulting collection of input-output pairs and the dataset produced by microarray experiments for a specific problem. This agreement can be evaluated by looking at the accuracy values scored by a discriminant method for different numbers of considered genes.
In this contribution, the Leukemia dataset has been considered and a proper artificial model has been generated by constructing a specific Boolean function for each of the two variants of leukemia examined. Figure 1 shows the accuracy values obtained through the leave-one-out approach by applying the SVM-RFE method described in and the technique proposed in. The agreement between the success rate curves is excellent in both situations.
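
The group-threshold construction described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm, whose details and tunable parameters the abstract does not give. The function names, the independent uniform sampling of gene activity, and every parameter value below are hypothetical choices made only for illustration.

```python
import random


def make_boolean_function(groups, min_active_genes, group_threshold):
    """Build f(x) -> 0/1 from groups of gene indices.

    A group counts as active when at least `min_active_genes` of its genes
    are active; f returns 1 when the number of active groups exceeds
    `group_threshold`. Both thresholds are illustrative assumptions.
    """
    def f(x):
        active_groups = sum(
            1 for group in groups
            if sum(x[i] for i in group) >= min_active_genes
        )
        return 1 if active_groups > group_threshold else 0
    return f


def generate_samples(f, n_genes, m_samples, p_active=0.5, seed=0):
    """Draw m input-output pairs (x, y) with y = f(x).

    Each gene is set active independently with probability `p_active`;
    this sampling scheme is an assumption, not taken from the abstract.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(m_samples):
        x = [1 if rng.random() < p_active else 0 for _ in range(n_genes)]
        samples.append((x, f(x)))
    return samples


def fraction_retrieved(selected_genes, groups):
    """Fraction of the significant genes (those in any group) that a
    gene selection method returned."""
    significant = {i for group in groups for i in group}
    return len(significant & set(selected_genes)) / len(significant)


# Hypothetical setup: 12 genes, of which genes 0..8 form three signatures
# and genes 9..11 are irrelevant to the state.
groups = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
f = make_boolean_function(groups, min_active_genes=2, group_threshold=1)
samples = generate_samples(f, n_genes=12, m_samples=100, seed=0)
```

Because the significant genes are known by construction, a gene selection method run on `samples` can be scored directly with `fraction_retrieved`, which corresponds to the percentage of significant genes retrieved mentioned in the abstract.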

6. Pavesi G, Valentini G
Classification of co-expressed genes from DNA regulatory regions
Meeting: BITS 2007 - Year: 2007
Topic: Novel methodologies, algorithms and tools

Abstract: Missing

7. Pavesi G, Valentini G, Mauri G, Pesole G
Motif based classification of coexpressed genes
Meeting: BITS 2006 - Year: 2006
Topic: Recognition of genes and regulatory elements

Abstract: Missing

8. Re M, Valentini G
Data fusion based gene function prediction using ensemble methods
Meeting: BITS 2009 - Year: 2009
Topic: Networks and Systems Biology

Abstract: Missing

9. Valentini G, Bertoni A
A statistical test based on the Bernstein inequality to discover multi-level structures in bio-molecular data.
Meeting: BITS 2007 - Year: 2007
Topic: Novel methodologies, algorithms and tools

Abstract: Missing
